Support yaml merge operator #1040

Merged
merged 7 commits into from
Sep 29, 2024

Conversation

msbarry
Contributor

@msbarry msbarry commented Sep 28, 2024

Support the YAML merge operator (<<) so that custom YAML files can reuse a block and override values in it, like this:

source: &label
  key: value
dest:
  <<: *label
  other_key: other_value
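
A minimal sketch of how a post-parse merge step could behave, assuming the YAML parser has already produced nested `Map`/`List` objects with aliases resolved. The class `MergeSketch` and the method `expandMerges` are hypothetical names for illustration and are not taken from the Planetiler code; only the idea of pulling the `<<` key out of a parsed map comes from this PR. The sketch handles a single mapping as the merge value and ignores any other shape.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeSketch {
  /** Returns a copy of {@code node} with every "<<" key expanded; explicit keys override merged ones. */
  public static Object expandMerges(Object node) {
    if (node instanceof Map<?, ?> map) {
      Map<Object, Object> result = new LinkedHashMap<>();
      // Merged entries go in first so the explicit keys below can override them.
      // Assumption: only a single mapping is supported here; other values are dropped.
      Object toMerge = map.get("<<");
      if (toMerge instanceof Map<?, ?> merged) {
        result.putAll((Map<?, ?>) expandMerges(merged));
      }
      for (Map.Entry<?, ?> entry : map.entrySet()) {
        if (!"<<".equals(entry.getKey())) {
          result.put(entry.getKey(), expandMerges(entry.getValue()));
        }
      }
      return result;
    } else if (node instanceof List<?> list) {
      return list.stream().map(MergeSketch::expandMerges).toList();
    }
    return node; // scalars are returned unchanged
  }

  public static void main(String[] args) {
    Map<String, Object> source = new LinkedHashMap<>();
    source.put("key", "value");
    Map<String, Object> dest = new LinkedHashMap<>();
    dest.put("<<", source); // the shape of `<<: *label` after parsing
    dest.put("other_key", "other_value");
    System.out.println(expandMerges(dest)); // prints {key=value, other_key=other_value}
  }
}
```

For the example above, `dest` ends up with both `key` and `other_key`, and a key set explicitly in `dest` would win over the same key in `source`.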

Fixes #1038


github-actions bot commented Sep 28, 2024

This Branch b1c3a24 (first log below) | Base b51be48 (second log below)
0:01:09 DEB [archive] - Tile stats:
0:01:09 DEB [archive] - Biggest tiles (gzipped)
1. 14/4942/6092 (159k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.40015 (poi:85k)
2. 9/154/190 (149k) https://onthegomap.github.io/planetiler-demo/#9.5/41.77078/-71.36719 (landcover:85k)
3. 10/308/380 (139k) https://onthegomap.github.io/planetiler-demo/#10.5/41.90214/-71.54297 (landcover:66k)
4. 10/308/381 (137k) https://onthegomap.github.io/planetiler-demo/#10.5/41.63994/-71.54297 (landcover:72k)
5. 14/4941/6092 (113k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.42212 (poi:65k)
6. 14/4941/6093 (111k) https://onthegomap.github.io/planetiler-demo/#14.5/41.81227/-71.42212 (building:62k)
7. 14/4940/6092 (100k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.44409 (building:92k)
8. 11/616/762 (99k) https://onthegomap.github.io/planetiler-demo/#11.5/41.7057/-71.63086 (landcover:71k)
9. 14/4942/6091 (97k) https://onthegomap.github.io/planetiler-demo/#14.5/41.84501/-71.40015 (building:79k)
10. 11/616/761 (96k) https://onthegomap.github.io/planetiler-demo/#11.5/41.83679/-71.63086 (landcover:72k)
0:01:09 DEB [archive] - Max tile sizes
                      z0    z1    z2    z3    z4    z5    z6    z7    z8    z9   z10   z11   z12   z13   z14   all
           boundary  155   375   444   584   939   341   435   550   775  1.6k  2.1k  7.2k  6.4k  5.8k  4.5k  7.2k
              water 7.7k  3.7k  8.6k  5.5k  2.6k  5.1k   15k   18k   16k   26k   15k   13k   17k   15k   12k   26k
              place    0     0   441   441   441   640   714    1k  1.6k  3.1k  5.7k  3.3k  1.7k   803   948  5.7k
            landuse    0     0     0     0   549   695  1.6k  6.8k   17k   44k   59k   50k   38k   19k   12k   59k
     transportation    0     0     0     0   314   850  1.2k    6k    8k   24k   17k   19k   65k   49k   36k   65k
           waterway    0     0     0     0   112   119     0     0     0  3.2k  2.3k  2.1k  2.1k  4.9k  2.4k  4.9k
               park    0     0     0     0     0     0  1.2k    4k  9.7k   19k   13k  8.2k  4.3k  3.4k  4.4k   19k
transportation_name    0     0     0     0     0     0   369   464  1.2k  1.8k  5.5k  4.7k  3.9k  3.4k   18k   18k
          landcover    0     0     0     0     0     0     0  9.6k   29k   85k   72k   81k   53k   30k   26k   85k
      mountain_peak    0     0     0     0     0     0     0  1.1k  1.8k  3.4k  4.3k  2.8k  1.4k  1.4k   869  4.3k
         water_name    0     0     0     0     0     0     0     0     0   486   461   433   452  1.2k  1.5k  1.5k
    aerodrome_label    0     0     0     0     0     0     0     0     0     0   666   328   273   221   221   666
            aeroway    0     0     0     0     0     0     0     0     0     0  1.6k  2.1k    3k  3.4k  2.8k  3.4k
                poi    0     0     0     0     0     0     0     0     0     0     0     0   506   503   85k   85k
           building    0     0     0     0     0     0     0     0     0     0     0     0     0   59k   92k   92k
        housenumber    0     0     0     0     0     0     0     0     0     0     0     0     0     0   35k   35k
          full tile 7.9k    4k  9.5k  6.5k  3.8k  6.1k   20k   42k   85k  203k  185k  135k  114k  129k  251k  251k
            gzipped 6.2k  3.6k  7.1k  5.2k  3.1k  4.9k   14k   29k   60k  149k  139k   99k   83k   92k  159k  159k
0:01:09 DEB [archive] -    Max tile: 251k (gzipped: 159k)
0:01:09 DEB [archive] -    Avg tile: 5.4k (gzipped: 4.1k) using weighted average based on OSM traffic
0:01:09 DEB [archive] -     # tiles: 4,115,029
0:01:09 DEB [archive] -  # features: 5,495,214
0:01:09 INF [archive] - Finished in 19s cpu:1m9s avg:3.7
0:01:09 INF [archive] -   read    1x(3% 0.6s wait:17s done:1s)
0:01:09 INF [archive] -   encode  4x(56% 11s wait:2s done:1s)
0:01:09 INF [archive] -   write   1x(21% 4s wait:13s done:1s)
0:01:09 INF [archive] - Finished in 1m10s cpu:3m39s gc:1s avg:3.1
0:01:09 INF [archive] - FINISHED!
0:01:09 INF [archive] - 
0:01:09 INF [archive] - ----------------------------------------
0:01:09 INF [archive] - data errors:
0:01:09 INF [archive] - 	render_snap_fix_input	16,666
0:01:09 INF [archive] - 	osm_multipolygon_missing_way	360
0:01:09 INF [archive] - 	osm_boundary_missing_way	73
0:01:09 INF [archive] - 	merge_snap_fix_input	12
0:01:09 INF [archive] - 	feature_centroid_if_convex_osm_invalid_multipolygon_empty_after_fix	2
0:01:09 INF [archive] - 	render_snap_fix_input2	1
0:01:09 INF [archive] - 	omt_fix_water_before_ne_intersect	1
0:01:09 INF [archive] - 	feature_polygon_osm_invalid_multipolygon_empty_after_fix	1
0:01:09 INF [archive] - 	feature_point_on_surface_osm_invalid_multipolygon_empty_after_fix	1
0:01:09 INF [archive] - ----------------------------------------
0:01:09 INF [archive] - 	overall          1m10s cpu:3m39s gc:1s avg:3.1
0:01:09 INF [archive] - 	lake_centerlines 3s cpu:5s avg:2.1
0:01:09 INF [archive] - 	  read     1x(18% 0.5s done:2s)
0:01:09 INF [archive] - 	  process  4x(0% 0s done:2s)
0:01:09 INF [archive] - 	  write    1x(0% 0s done:2s)
0:01:09 INF [archive] - 	water_polygons   15s cpu:43s avg:2.8
0:01:09 INF [archive] - 	  read     1x(38% 6s done:6s)
0:01:09 INF [archive] - 	  process  4x(31% 5s wait:2s done:5s)
0:01:09 INF [archive] - 	  write    1x(4% 0.6s wait:10s done:5s)
0:01:09 INF [archive] - 	natural_earth    11s cpu:17s avg:1.6
0:01:09 INF [archive] - 	  read     1x(57% 6s done:5s)
0:01:09 INF [archive] - 	  process  4x(8% 0.8s wait:6s done:5s)
0:01:09 INF [archive] - 	  write    1x(0% 0s wait:6s done:5s)
0:01:09 INF [archive] - 	osm_pass1        2s cpu:7s avg:3.5
0:01:09 INF [archive] - 	  read     1x(2% 0s wait:2s)
0:01:09 INF [archive] - 	  parse    4x(33% 0.7s)
0:01:09 INF [archive] - 	  process  1x(73% 1s)
0:01:09 INF [archive] - 	osm_pass2        18s cpu:1m12s avg:3.9
0:01:09 INF [archive] - 	  read     1x(0% 0s wait:11s done:8s)
0:01:09 INF [archive] - 	  process  4x(75% 14s)
0:01:09 INF [archive] - 	  write    1x(2% 0.4s wait:18s)
0:01:09 INF [archive] - 	ne_lakes         0s cpu:0s avg:0
0:01:09 INF [archive] - 	boundaries       0s cpu:0s avg:1.3
0:01:09 INF [archive] - 	agg_stop         0s cpu:0s avg:0
0:01:09 INF [archive] - 	sort             1s cpu:3s avg:2.6
0:01:09 INF [archive] - 	  worker  1x(53% 0.7s)
0:01:09 INF [archive] - 	archive          19s cpu:1m9s avg:3.7
0:01:09 INF [archive] - 	  read    1x(3% 0.6s wait:17s done:1s)
0:01:09 INF [archive] - 	  encode  4x(56% 11s wait:2s done:1s)
0:01:09 INF [archive] - 	  write   1x(21% 4s wait:13s done:1s)
0:01:09 INF [archive] - ----------------------------------------
0:01:09 INF [archive] - 	archive	108MB
0:01:09 INF [archive] - 	features	291MB
-rw-r--r-- 1 runner docker 85M Sep 29 09:54 run.jar
0:01:04 DEB [archive] - Tile stats:
0:01:04 DEB [archive] - Biggest tiles (gzipped)
1. 14/4942/6092 (159k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.40015 (poi:85k)
2. 9/154/190 (149k) https://onthegomap.github.io/planetiler-demo/#9.5/41.77078/-71.36719 (landcover:85k)
3. 10/308/380 (139k) https://onthegomap.github.io/planetiler-demo/#10.5/41.90214/-71.54297 (landcover:66k)
4. 10/308/381 (137k) https://onthegomap.github.io/planetiler-demo/#10.5/41.63994/-71.54297 (landcover:72k)
5. 14/4941/6092 (113k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.42212 (poi:65k)
6. 14/4941/6093 (111k) https://onthegomap.github.io/planetiler-demo/#14.5/41.81227/-71.42212 (building:62k)
7. 14/4940/6092 (100k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.44409 (building:92k)
8. 11/616/762 (99k) https://onthegomap.github.io/planetiler-demo/#11.5/41.7057/-71.63086 (landcover:71k)
9. 14/4942/6091 (97k) https://onthegomap.github.io/planetiler-demo/#14.5/41.84501/-71.40015 (building:79k)
10. 11/616/761 (96k) https://onthegomap.github.io/planetiler-demo/#11.5/41.83679/-71.63086 (landcover:72k)
0:01:04 DEB [archive] - Max tile sizes
                      z0    z1    z2    z3    z4    z5    z6    z7    z8    z9   z10   z11   z12   z13   z14   all
           boundary  155   375   444   584   939   341   435   550   775  1.6k  2.1k  7.2k  6.4k  5.8k  4.5k  7.2k
              water 7.7k  3.7k  8.6k  5.5k  2.6k  5.1k   15k   18k   16k   26k   15k   13k   17k   15k   12k   26k
              place    0     0   441   441   441   640   714    1k  1.6k  3.1k  5.7k  3.3k  1.7k   803   948  5.7k
            landuse    0     0     0     0   549   695  1.6k  6.8k   17k   44k   59k   50k   38k   19k   12k   59k
     transportation    0     0     0     0   314   850  1.2k    6k    8k   24k   17k   19k   65k   49k   36k   65k
           waterway    0     0     0     0   112   119     0     0     0  3.2k  2.3k  2.1k  2.1k  4.9k  2.4k  4.9k
               park    0     0     0     0     0     0  1.2k    4k  9.7k   19k   13k  8.2k  4.3k  3.4k  4.4k   19k
transportation_name    0     0     0     0     0     0   369   464  1.2k  1.8k  5.5k  4.7k  3.9k  3.4k   18k   18k
          landcover    0     0     0     0     0     0     0  9.6k   29k   85k   72k   81k   53k   30k   26k   85k
      mountain_peak    0     0     0     0     0     0     0  1.1k  1.8k  3.4k  4.3k  2.8k  1.4k  1.4k   869  4.3k
         water_name    0     0     0     0     0     0     0     0     0   486   461   433   452  1.2k  1.5k  1.5k
    aerodrome_label    0     0     0     0     0     0     0     0     0     0   666   328   273   221   221   666
            aeroway    0     0     0     0     0     0     0     0     0     0  1.6k  2.1k    3k  3.4k  2.8k  3.4k
                poi    0     0     0     0     0     0     0     0     0     0     0     0   506   503   85k   85k
           building    0     0     0     0     0     0     0     0     0     0     0     0     0   59k   92k   92k
        housenumber    0     0     0     0     0     0     0     0     0     0     0     0     0     0   35k   35k
          full tile 7.9k    4k  9.5k  6.5k  3.8k  6.1k   20k   42k   85k  203k  185k  135k  114k  129k  251k  251k
            gzipped 6.2k  3.6k  7.1k  5.2k  3.1k  4.9k   14k   29k   60k  149k  139k   99k   83k   92k  159k  159k
0:01:04 DEB [archive] -    Max tile: 251k (gzipped: 159k)
0:01:04 DEB [archive] -    Avg tile: 5.4k (gzipped: 4.1k) using weighted average based on OSM traffic
0:01:04 DEB [archive] -     # tiles: 4,115,029
0:01:04 DEB [archive] -  # features: 5,495,214
0:01:04 INF [archive] - Finished in 19s cpu:1m10s avg:3.7
0:01:04 INF [archive] -   read    1x(3% 0.6s wait:17s done:1s)
0:01:04 INF [archive] -   encode  4x(56% 11s wait:2s)
0:01:04 INF [archive] -   write   1x(21% 4s wait:13s)
0:01:04 INF [archive] - Finished in 1m4s cpu:3m30s gc:1s avg:3.3
0:01:04 INF [archive] - FINISHED!
0:01:04 INF [archive] - 
0:01:04 INF [archive] - ----------------------------------------
0:01:04 INF [archive] - data errors:
0:01:04 INF [archive] - 	render_snap_fix_input	16,666
0:01:04 INF [archive] - 	osm_multipolygon_missing_way	360
0:01:04 INF [archive] - 	osm_boundary_missing_way	73
0:01:04 INF [archive] - 	merge_snap_fix_input	12
0:01:04 INF [archive] - 	feature_centroid_if_convex_osm_invalid_multipolygon_empty_after_fix	2
0:01:04 INF [archive] - 	render_snap_fix_input2	1
0:01:04 INF [archive] - 	omt_fix_water_before_ne_intersect	1
0:01:04 INF [archive] - 	feature_polygon_osm_invalid_multipolygon_empty_after_fix	1
0:01:04 INF [archive] - 	feature_point_on_surface_osm_invalid_multipolygon_empty_after_fix	1
0:01:04 INF [archive] - ----------------------------------------
0:01:04 INF [archive] - 	overall          1m4s cpu:3m30s gc:1s avg:3.3
0:01:04 INF [archive] - 	lake_centerlines 2s cpu:5s avg:2.4
0:01:04 INF [archive] - 	  read     1x(22% 0.5s done:2s)
0:01:04 INF [archive] - 	  process  4x(0% 0s done:2s)
0:01:04 INF [archive] - 	  write    1x(0% 0s done:2s)
0:01:04 INF [archive] - 	water_polygons   15s cpu:41s avg:2.7
0:01:04 INF [archive] - 	  read     1x(39% 6s done:7s)
0:01:04 INF [archive] - 	  process  4x(30% 4s wait:2s done:5s)
0:01:04 INF [archive] - 	  write    1x(4% 0.6s wait:9s done:5s)
0:01:04 INF [archive] - 	natural_earth    6s cpu:12s avg:2
0:01:04 INF [archive] - 	  read     1x(96% 6s)
0:01:04 INF [archive] - 	  process  4x(13% 0.8s wait:6s)
0:01:04 INF [archive] - 	  write    1x(0% 0s wait:6s)
0:01:04 INF [archive] - 	osm_pass1        2s cpu:7s avg:3.3
0:01:04 INF [archive] - 	  read     1x(2% 0s wait:2s)
0:01:04 INF [archive] - 	  parse    4x(33% 0.7s)
0:01:04 INF [archive] - 	  process  1x(71% 1s)
0:01:04 INF [archive] - 	osm_pass2        18s cpu:1m10s avg:3.9
0:01:04 INF [archive] - 	  read     1x(0% 0s wait:11s done:7s)
0:01:04 INF [archive] - 	  process  4x(77% 14s)
0:01:04 INF [archive] - 	  write    1x(2% 0.4s wait:17s)
0:01:04 INF [archive] - 	ne_lakes         0s cpu:0s avg:0
0:01:04 INF [archive] - 	boundaries       0s cpu:0s avg:2.7
0:01:04 INF [archive] - 	agg_stop         0s cpu:0s avg:0
0:01:04 INF [archive] - 	sort             1s cpu:3s avg:2.6
0:01:04 INF [archive] - 	  worker  1x(53% 0.7s)
0:01:04 INF [archive] - 	archive          19s cpu:1m10s avg:3.7
0:01:04 INF [archive] - 	  read    1x(3% 0.6s wait:17s done:1s)
0:01:04 INF [archive] - 	  encode  4x(56% 11s wait:2s)
0:01:04 INF [archive] - 	  write   1x(21% 4s wait:13s)
0:01:04 INF [archive] - ----------------------------------------
0:01:04 INF [archive] - 	archive	108MB
0:01:04 INF [archive] - 	features	291MB
-rw-r--r-- 1 runner docker 85M Sep 29 09:56 run.jar

Full logs: https://github.com/onthegomap/planetiler/actions/runs/11091650043

 */
private static void handleMergeOperator(Object parsed) {
  if (parsed instanceof Map<?, ?> map) {
    Object toMerge = map.remove("<<");
Contributor

Could this code, or something similar, be used to check whether the merge operator is the first key in the mapping?

Suggested change
- Object toMerge = map.remove("<<");
+ // Check if the first key is a merge operator using an iterator
+ Iterator<?> keyIterator = map.keySet().iterator();
+ Object toMerge = null;
+ // hasNext() guards against an empty map, where next() would throw
+ if (keyIterator.hasNext() && "<<".equals(keyIterator.next())) {
+   toMerge = map.remove("<<");
+ }

Contributor

Actually, when I read the merge operator draft again, it does not prohibit placing the merge operator at any location in a map. Neither does it prohibit multiple instances of merge operators in the same map.

Here is a YAML example in which the merge operator is not restricted to the first key, as it would be under this suggestion:

- base: &base
    nested_mapping: &base_nested_mapping
      key1: value1
      nested_sequence:
        - item1
        - item2
- extension: &extension
    nested_mapping:
      <<: *base_nested_mapping
      key2: value2
      nested_sequence:
        - item3
        - item4
- merged_using_array:
    <<: [ *extension, *base]
- merged_separately:
    <<: *extension
    <<: *base
- merged_interleaved:
    start: marker1
    <<: *extension
    mid: marker2
    <<: *base
    end: marker3

According to YAML Lint, it would expand to

- base:
    nested_mapping:
      key1: value1
      nested_sequence:
        - item1
        - item2
- extension:
    nested_mapping:
      key1: value1
      nested_sequence:
        - item3
        - item4
      key2: value2
- merged_using_array:
    nested_mapping:
      key1: value1
      nested_sequence:
        - item3
        - item4
      key2: value2
- merged_separately:
    nested_mapping:
      key1: value1
      nested_sequence:
        - item3
        - item4
      key2: value2
- merged_interleaved:
    start: marker1
    nested_mapping:
      key1: value1
      nested_sequence:
        - item3
        - item4
      key2: value2
    mid: marker2
    end: marker3
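
The expansion above shows two properties of the merge key: the merge is shallow (the `nested_mapping` from `extension` replaces the one from `base` as a whole, it is not combined with it), and when the value is a sequence, mappings earlier in the sequence take precedence over later ones. A minimal sketch of that precedence rule follows; `SequenceMergeSketch` and `mergeAll` are hypothetical names, not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SequenceMergeSketch {
  /** Shallow-merges the mappings in order; a key already set by an earlier mapping is kept. */
  public static Map<String, Object> mergeAll(List<Map<String, Object>> mappings) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Map<String, Object> mapping : mappings) {
      for (Map.Entry<String, Object> entry : mapping.entrySet()) {
        // Values are never merged recursively: the first mapping to define a key wins outright.
        if (!result.containsKey(entry.getKey())) {
          result.put(entry.getKey(), entry.getValue());
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Object> base = Map.of("nested_mapping", Map.of("key1", "value1"));
    Map<String, Object> extension =
      Map.of("nested_mapping", Map.of("key1", "value1", "key2", "value2"));
    // Mirrors `<<: [ *extension, *base]`: the nested_mapping from extension is the one kept.
    System.out.println(mergeAll(List.of(extension, base)));
  }
}
```

Swapping the order to `List.of(base, extension)` would keep the `nested_mapping` from `base` instead.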

Please feel free to reject this suggestion

Contributor Author

Yeah, I noticed that while playing around with pyyaml. Since this approach runs post-parsing, the parser throws a clear exception in this case, and you can easily switch to <<: [*a, *b], I think it's an OK limitation. A more fully thought-out merge operator might support things like merging maps in the order they are specified when the operator appears more than once (and working for lists), but that would require support at the parser level.
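
A minimal sketch of how a post-parse step could accept both `<<: *a` and `<<: [*a, *b]`, assuming the parser hands over either a `Map` or a `List` of `Map`s as the value of the `<<` key. `MergeValueSketch` and `toMappings` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MergeValueSketch {
  /** Normalizes the value of a "<<" key to a list of mappings, or throws if it has another shape. */
  public static List<Map<?, ?>> toMappings(Object mergeValue) {
    if (mergeValue instanceof Map<?, ?> single) {
      return List.of(single); // `<<: *a`
    }
    if (mergeValue instanceof List<?> items) {
      // `<<: [*a, *b]`: every element must itself be a mapping
      List<Map<?, ?>> result = new ArrayList<>();
      for (Object item : items) {
        if (!(item instanceof Map<?, ?> mapping)) {
          throw new IllegalArgumentException("Expected a mapping inside <<, got: " + item);
        }
        result.add(mapping);
      }
      return result;
    }
    throw new IllegalArgumentException("Expected a mapping or a list of mappings for <<, got: " + mergeValue);
  }

  public static void main(String[] args) {
    Map<String, Object> a = Map.of("x", 1);
    Map<String, Object> b = Map.of("y", 2);
    System.out.println(toMappings(a).size());             // prints 1
    System.out.println(toMappings(List.of(a, b)).size()); // prints 2
  }
}
```

Because a parsed `Map` holds only one value per key, a second `<<` entry in the same mapping cannot be represented at this stage, which is why the list form is the workaround.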

@zstadler
Contributor

It is worth checking that the code can identify and avoid circular references such as:

include_when: &include
  << : *include
  power:
    - tower

Circular references are not specific to the merge operator. They can also occur with plain anchors and aliases:

include_when: &include
  - *include
  - power:
    - tower
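
A minimal sketch of one way to detect such references after parsing. It assumes the parser materializes a recursive alias as a `Map` or `List` that contains itself, which is an assumption about parser behavior and not something this PR confirms. `CycleCheckSketch` and `hasCycle` are hypothetical names.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CycleCheckSketch {
  /** Returns true if any map or list under {@code root} contains itself, directly or indirectly. */
  public static boolean hasCycle(Object root) {
    return hasCycle(root, Collections.newSetFromMap(new IdentityHashMap<>()));
  }

  private static boolean hasCycle(Object node, Set<Object> ancestors) {
    Iterable<?> children;
    if (node instanceof Map<?, ?> map) {
      children = map.values(); // only values are followed in this sketch, not keys
    } else if (node instanceof List<?> list) {
      children = list;
    } else {
      return false; // scalars cannot contain references
    }
    // Containers are tracked by identity because hashing a self-containing map is unsafe.
    if (!ancestors.add(node)) {
      return true; // this container is already on the current path
    }
    for (Object child : children) {
      if (hasCycle(child, ancestors)) {
        return true;
      }
    }
    ancestors.remove(node); // an alias reused in two places is not a cycle
    return false;
  }

  public static void main(String[] args) {
    Map<String, Object> include = new LinkedHashMap<>();
    include.put("<<", include); // the shape of `<< : *include` inside `&include`
    include.put("power", List.of("tower"));
    System.out.println(hasCycle(include)); // prints true
  }
}
```

Running a check like this before expanding `<<` keys would let the merge step fail with a clear error instead of recursing without end.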


@msbarry msbarry merged commit 70d5857 into main Sep 29, 2024
12 checks passed
@zstadler
Contributor

Thanks @msbarry for taking care of this request so quickly!

BTW, is there some debug switch that can show what the expanded custom YAML file looks like?
I think that would be very useful for utilizing anchors, aliases and the merge operator

Development

Successfully merging this pull request may close these issues.

[BUG] YAML merge operator not supported in custom maps
2 participants